shifted QR algorithm implementation for computing roots given coefficients#3
shifted QR algorithm implementation for computing roots given coefficients#3pasquale90 wants to merge 16 commits into
Conversation
justonem0reuser
left a comment
There was a problem hiding this comment.
Thank you,
All the existing tests are passed.
Unfortunately, I didn't go into details of the method itself, so I can only propose some ways for optimization.
-
Have you considered using 1D vectors for
A,Q,Rinstead of "vectors of vectors"?
I believe this would speed up the process significantly.
The drawback is the necessity of manual indexing, for example:Q[i * degree + j]instead ofQ[i][j]. -
I wouldn't perform memory allocation in cycle.
-
The "old-school" optimization (though probably it doesn't make sense nowadays): not to repeat some calculations (sqrt) and prefer multiplication to division.
-
Matrix multiplication can also be accelerated, but it requires more sophisticated optimization.
-
In TODO note it is proposed to change
QRreturn type toboolfor the case when there is a pole outside the unit. Is it really necessary?CoefficientsToRootsis an abstract math class that doesn't solve only the particular filter design task, so it is probably not its responsibility to perform this check.
| for (size_t col = 0 ; col < shift_idx; col++) | ||
| { | ||
| // set v with column A[col] | ||
| std::vector<double> v(shift_idx); |
There was a problem hiding this comment.
| std::vector<double> v(shift_idx); |
There was a problem hiding this comment.
OK, the allocation of this vector is now moved outside the loops e75f2ae
| std::vector<std::vector<double>> Q(degree, std::vector<double>(degree)); | ||
| std::vector<std::vector<double>> R(degree, std::vector<double>(degree)); | ||
|
|
||
| size_t shift_idx {degree}, iter {0}; |
There was a problem hiding this comment.
| size_t shift_idx {degree}, iter {0}; | |
| size_t shift_idx {degree}, iter {0}; | |
| std::vector<double> v(shift_idx); |
| norm = std::sqrt(norm); | ||
|
|
||
| for (size_t k = 0; k < shift_idx; ++k) | ||
| { | ||
| Q[k][col] = v[k] / norm; | ||
| } |
There was a problem hiding this comment.
| norm = std::sqrt(norm); | |
| for (size_t k = 0; k < shift_idx; ++k) | |
| { | |
| Q[k][col] = v[k] / norm; | |
| } | |
| double normReciprocal = 1.0 / std::sqrt(norm); | |
| for (size_t k = 0; k < shift_idx; ++k) | |
| { | |
| Q[k][col] = v[k] * normReciprocal; | |
| } | |
There was a problem hiding this comment.
OK, fixed that here f4bb673
Btw, next time feel free to push the changes yourself , you deserve the credits :)
| if ( discriminant >= 0.0) | ||
| { | ||
| // tow real roots | ||
| addRoot(c128(((a+d) + std::sqrt(discriminant)) / 2.0, 0.0)); | ||
| addRoot(c128(((a+d) - std::sqrt(discriminant)) / 2.0, 0.0)); | ||
| } | ||
| else | ||
| { | ||
| // Complex conjugate pair | ||
| const double re = (a+d) / 2.0; | ||
| const double im = std::sqrt(-discriminant) / 2.0; | ||
| addRoot(c128(re,im)); | ||
| // addRoot(c128(re,-im)); // Note: this is added automatically later using FilterState::add method. Commenting this, removes the bug of overlapping roots. | ||
| } |
There was a problem hiding this comment.
| if ( discriminant >= 0.0) | |
| { | |
| // tow real roots | |
| addRoot(c128(((a+d) + std::sqrt(discriminant)) / 2.0, 0.0)); | |
| addRoot(c128(((a+d) - std::sqrt(discriminant)) / 2.0, 0.0)); | |
| } | |
| else | |
| { | |
| // Complex conjugate pair | |
| const double re = (a+d) / 2.0; | |
| const double im = std::sqrt(-discriminant) / 2.0; | |
| addRoot(c128(re,im)); | |
| // addRoot(c128(re,-im)); // Note: this is added automatically later using FilterState::add method. Commenting this, removes the bug of overlapping roots. | |
| } | |
| const double halfSum = 0.5 * (a + d); | |
| const double halfSqrt = 0.5 * std::sqrt(std::abs(discriminant)); | |
| if ( discriminant >= 0.0) | |
| { | |
| // tow real roots | |
| addRoot(c128(halfSum + halfSqrt, 0.0)); | |
| addRoot(c128(halfSum - halfSqrt, 0.0)); | |
| } | |
| else | |
| { | |
| // Complex conjugate pair | |
| const double re = halfSum; | |
| const double im = halfSqrt; | |
| addRoot(c128(re,im)); | |
| // addRoot(c128(re,-im)); // Note: this is added automatically later using FilterState::add method. Commenting this, removes the bug of overlapping roots. | |
| } | |
|
This is great work Paschalis. I agree with all of Anton's points and suggestions. Indeed compilers will perform common subexpression elimination in optimized builds, but performing it manually can speed up debug builds and simplify refactoring. The memory layout/ access patterns will be completely reworked during the optimization pass, so there's no point worrying about that for this PR. The case of a pole outside the unit circle should be handled in the coefficients editor. When you compute the new poles just scan the vector for any unstable poles. If you find any, don't update the filter state. There should be some indication to the user that coefficient they just entered made the pole unstable, though it's hard to give actionable advice for how to supply coefficients which result in stable filters. Currently, trying to add a pole outside the unit circle will not do anything, which means the editor will display different coefficients from what a user enters. |
…d on the GramSchmidt method. Computes both real and complex roots Cons: Inefficient as number of coefficients scale up. Optimizations and alternative methods that provide a more efficient way to handle input should be considered for providing a practical application of this feature
…ble and maintainable. + added a todo as a comment for improving the data structure that stores the extracted roots in the upcoming updates
1. fix bug in updateFilterStateOnCoefEdit when adding a pole that already exists (increasing its order), and then removing it (removing the previously added new pole) 2. fix index of returned root when degree equals 1.0 3. fix sign of returned root when degree equals 1.0 4. fix coef_index when constructing the companion matrix. Now it always initializes the matrix with the last degree-lengthed coefficients (correctly ignoring the leading zeros and 1.0)
change termination condition, now all subdiagonal values should be below epsilon (not their sum) fix typo : step 3 - A = R Q change tolerance threshold to 5e-2 add order in prints for debugging
trailing zeros in (feedback) coefficients (when adding default poles) are now parsed out of the companion matrix and roots at 0.,0. (of order equal to #numOfTrailingZeros) are appended directly into the root list before building the companion matrix and applying the QR prev_sz is now calculated directly before appending new poles on the filterstate, taking into account that some updated roots may not result in removing previous poles, but changing their order instead leading zeros if (feedforward) coefficients (when total order of poles exceeds total order of zeros) are now tracked using std::numeric_limits instead of comparing doubles with the == operator.
… from GramSchmidt to QR
…). An empirical tolerance variable is used allowing errors of 1 order per 4 orders (scaling per root)
…o-optimization) :
Optimization change: replace 2D vectors for A,Q,R with 1D vectors, applying manual indexing ([row][col] now becomes [row * degree + col]).
Micro-optimization change : Avoid repeated memory allocation by moving the v vector for storing current columns of A outside loops, and setting its size to {max shift_idx}, that equals to degree of the polynomial
|
Thank you for the recommendations . This a very constructive feedback and I totally agree with your points. I was about adding tests before applying any change, but its fine working on them after demystifying the following regarding testsBefore applying any change, I considered adding tests to have a reference and a basis for comparison. I ve been able to push some tests here (50f60fe). These are comparing the total order of the computed roots to the total order of the given ones, with an empirical tolerance for higher order roots ( 1 order of tolerance for each root, i.e. a 4th order root would add up to the total tolerance by 1, a 8th order root by 2 and so on). Here are some concerns about the testing framework of this implementation: TLDR; Single root testing simple and straightforward / multiple root testing black hole and pain.
|
… loop over subdiabonal elements and b) by avoiding double call of sqrt + replacing division with 2.0 by multipying with 0.5 instead on the extractRoots method
Yes, this is a critical optimization step. Replaced 2D vectors for A,Q,R with 1D vectors here e75f2ae Both prime integration tests and manual tests seem to produce the same results. |
…ating causality constraint (at least one pole outside the unit circle - magnitude >= 1.0)
Thanks you @ry-gatoni In this commit 445fdd6 a check before updating the filterstate implemented, avoiding filterstate updates in case of instability. Informing the user about it, is not yet implemented and left as a TODO item. Lmk if that s properly handled and fine to leave as it is for now |
|
A few notes:
So I don't think this should be a unit test per se; the metric for "correctness" should be "do the roots end up in the expected place with the expected orders when you add them to the filter state", since that's what we care about at the end of the day anyway. As for the case of unstable poles, we can discuss how to notify the user on Wednesday. It's fine not having that done for this PR. Once the tests have been modified to check what I described above and they are passing, I think this PR will be good to merge. I'm happy to contribute some of the modifications I've made while reviewing these changes if that would be helpful. |
@acknowledgment : Many thanks to @LizAryslanova and my sincere gratitude for her work in analyzing the algorithm and for her guidance in helping me understand it
Method used to convert polynomial coefficients into roots using a QR method.
Consists of the following steps:
Leftover tasks: